A preference learning approach to sentence ordering for multi-document summarization
Abstract
Ordering information is a difficult but important task for applications that generate natural-language text, such as multi-document summarization, question answering, and concept-to-text generation. In multi-document summarization, information is selected from a set of source documents, so the optimal ordering of the selected pieces of information to create a coherent summary is not obvious. Improper ordering of information in a summary can both confuse the reader and degrade the readability of the summary. Therefore, it is vital to properly order the information in multi-document summarization. We model the problem of sentence ordering in multi-document summarization as one of learning the optimal combination of preference experts that determine the ordering between two given sentences. To capture the preference of one sentence over another, we define five preference experts: chronology, probabilistic, topical-closeness, precedence, and succession. We use summaries ordered by human annotators as training data to learn the optimal combination of the different preference experts. Finally, the learnt combination is applied to order sentences extracted in a multi-document summarization system. The proposed sentence ordering algorithm considers pairwise comparisons between sentences to determine a total ordering using a greedy search algorithm, thereby avoiding the combinatorial time complexity typically associated with total ordering tasks. This enables us to efficiently order sentences in longer summaries, making the proposed approach usable in real-world text summarization systems. We evaluate the sentence orderings produced by the proposed method and numerous baselines using both semi-automatic evaluation measures and a subjective evaluation.
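The idea of combining pairwise preference experts and then ordering greedily can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not state how the experts are combined or which greedy criterion is used, so the weighted linear combination, the "outgoing minus incoming preference" selection rule, and the names `greedy_order`, `chronology`, and `dates` are all assumptions made for illustration. In the paper the weights are learnt from human-ordered summaries; here they are simply supplied by hand.

```python
from typing import Callable, List, Sequence

# Hypothetical expert interface: returns a score in [0, 1], where values
# near 1 mean "the first sentence should precede the second".
Expert = Callable[[str, str], float]


def greedy_order(
    sentences: Sequence[str],
    experts: Sequence[Expert],
    weights: Sequence[float],
) -> List[str]:
    """Order sentences greedily from combined pairwise preferences.

    Assumption: experts are combined as a weighted sum, and at each step
    we pick the remaining sentence whose total preference over the other
    remaining sentences, minus their preference over it, is largest.
    """
    n = len(sentences)

    # pref[i][j] = combined preference for placing sentence i before j.
    pref = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                pref[i][j] = sum(
                    w * expert(sentences[i], sentences[j])
                    for expert, w in zip(experts, weights)
                )

    remaining = list(range(n))
    ordered: List[str] = []
    while remaining:
        # Net preference of candidate i against everything still unplaced.
        best = max(
            remaining,
            key=lambda i: sum(pref[i][j] - pref[j][i] for j in remaining),
        )
        ordered.append(sentences[best])
        remaining.remove(best)
    return ordered


# Toy chronology expert: prefers the sentence with the earlier date.
# The dates are made-up illustration data, not from the paper.
dates = {"A": 1, "B": 2, "C": 3}


def chronology(u: str, v: str) -> float:
    return 1.0 if dates[u] < dates[v] else 0.0


if __name__ == "__main__":
    print(greedy_order(["C", "A", "B"], [chronology], [1.0]))
```

The sketch only ever evaluates sentence pairs and never enumerates permutations, which is the property the abstract relies on to avoid the combinatorial cost of searching over all total orderings.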
Similar resources
An Effective Sentence Ordering Approach For Multi-Document Summarization Using Text Entailment
With the rapid development of modern technology electronically available textual information has increased to a considerable amount. Summarization of textual information manually from unstructured text sources creates overhead to the user, therefore a systematic approach is required. Summarization is an approach that focuses on providing the user with a condensed version of the original text bu...
A Bottom-Up Approach to Sentence Ordering for Multi-Document Summarization
Ordering information is a difficult but important task for applications generating natural-language text. We present a bottom-up approach to arranging sentences extracted for multi-document summarization. To capture the association and order of two textual segments (e.g., sentences), we define four criteria: chronology, topical-closeness, precedence, and succession. These criteria are integrated ...
Sentence Ordering based on Cluster Adjacency in Multi-Document Summarization
In this paper, we propose a cluster-adjacency based method to order sentences for multi-document summarization tasks. Given a group of sentences to be organized into a summary, each sentence was mapped to a theme in source documents by a semi-supervised classification method, and adjacency of pairs of sentences is learned from source documents based on adjacency of clusters they belong to. Then...
Sentence ordering with manifold-based classification in multi-document summarization
In this paper, we propose a sentence ordering algorithm using a semi-supervised sentence classification and historical ordering strategy. The classification is based on the manifold structure underlying sentences, addressing the problem of limited labeled data. The historical ordering helps to ensure topic continuity and avoid topic bias. Experiments demonstrate that the method is effective.
Significance of Sentence Ordering in Multi Document Summarization
Multi-document summarization represents the information in a concise and comprehensive manner. In this paper we discuss the significance of ordering of sentences in multi document summarization. We show experimental results on DUC2002 dataset. These results show the ordering of summaries before and, improvement in this, after applying sentence ordering. For this purpose we used a term frequency...
Journal title:
- Inf. Sci.
Volume 217
Publication date 2012